{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "####  Bayes Theorem"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 贝叶斯由来\n",
    "\n",
    "周末和朋友在一个大型商业综合体逛街吃饭，选择太多了，去吃什么好呢？好朋友聚在一起不容易，因此吃饭的选择会比较重要。如何选择呢？大家会提出各自建议：选一个口味符合的，选一个装修漂亮的，选一个生意兴旺的，选一个在美团上有很多介绍的，选一个有朋友去过的，诸如此类的建议你肯定需要尽可能的采纳，而且希望得到一个完美的交集。但是显然把所有的建议都遍历一遍的话，估计大家都不等不到吃饭了，而且你能找到收集全所有的建议吗？显然不能。但是选择还是要做的，饭总归要吃的。因此我们的决策是**基于一个不完备的信息集的**。此外，我们的决策过程是将有效的建议，一个一个根据重要性来尝试的，比如，我们认为一个好的餐馆，第一个也是最重要的特征是在网络平台上有好口碑，第二个次重要的特征是当天的生意好、客人多，第三重要的是我们当中有人去过的...... **我们的决策是一个不断的，但同时是有序的数据采集和信息分析的过程**。\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "我们知道生活中绝大多数决策面临的信息都是不全的，我们只有有限的信息。在无法得到全面的信息的情况下，我们就在信息有限的情况下，尽可能做出一个好的预测。这个每天都在进行的过程就是大名鼎鼎的贝叶斯理论(Bayes theorem)，或者是贝叶斯推断(Bayes inference)，贝叶斯法则(Bayes rule).\n",
    "\n",
    "\n",
    ">[Bayes' theorem](https://www.youtube.com/watch?v=HZGCoVF3YvM \"Bayes theorem\"), named after 18th-century British mathematician Thomas Bayes, is a mathematical formula for determining conditional probability. Conditional probability is the likelihood of an outcome occurring, based on a previous outcome occurring. Bayes' theorem provides a way to revise existing predictions or theories (update probabilities) given new or additional evidence.\n",
    "*It is often employed in finance in updating risk evaluation.*\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 贝叶斯公式\n",
    "\n",
    "$$\n",
    "P(B|A) = \\frac{P(B)\\times P(A|B)}{P(A)} = P(B)\\times \\frac{P(A|B)}{P(A)}\n",
    "$$\n",
    "\n",
    "其中分母中的$P(A)$可以写作 \n",
    "\n",
    "$$\n",
    "P(A) = \\sum_{i=1}^{n} P(B_i) \\times P(A|B_i)\n",
    "$$\n",
    "\n",
    "其中$i = 1,2,\\ldots,n$ 是事件B所可能的$n$种状态，或者说多个B状态。\n",
    "\n",
    "我们可以将Bayes公式中的 $P(B|A)$理解为事件B在事件A发生状态下的后验状态，$P(B)$ 是事件B的先验概率。\n",
    "\n",
    "**简单举个例子：**\n",
    "\n",
    "我在考虑今天中午去食堂吃饭还是去宝龙吃饭 $P(B_1)$ 是去食堂吃饭的概率，$P(B_2)$ 是去宝龙吃饭的概率。\n",
    "\n",
    "目前的考虑因素有今天下午要上课，去食堂比较快。上课这个条件是$A$. \n",
    "\n",
    "$P(A|B_1)$ 可以理解在食堂吃饭的情况下有课的概率，$P(A)$ 是上课的概率。那么我们计算今天去食堂吃饭的概率应该是：\n",
    "\n",
    "$$\n",
    "P(B_1|A) = \\text{（今天要上课）准备在食堂吃饭的概率} = \\text{正常去食堂吃饭概率}\\times \\frac{\\text{在食堂吃饭的情况下有课的概率}}{\\text{上课的概率}}\n",
    "$$\n",
    "\n",
    "\n",
    "或者根据中心极限定理，我们将频率作为概率，将上式右边所有的概率使用在一定时间（观测期）该事件发生的次数（频率）来代表。如给定观测期是一年，则**正常去食堂吃饭的概率**就等于**在一年中去食堂吃饭的次数除以一年的天数**。\n",
    "\n",
    "\n",
    "请试试再加上一个条件（事件），**今天算算账好像这个月的钱有点超支**。事件$A_1$是今天下午有课，事件$A_2$是今天钱不多了。在有这两个因素需要考虑的情况下，你会怎么办？\n",
    "\n",
    "**请注意！ 这边是多个A 而不是 B，不要搞错哦**\n",
    "\n",
    "递推贝叶斯算法，Recursive Bayesian estimation\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "### 贝叶斯公式的推导\n",
    "\n",
    "\n",
    "[朴素贝叶斯分类器原理](https://zhuanlan.zhihu.com/p/30241523 \"朴素贝叶斯分类器原理与实战\")\n",
    "\n",
    "我们利用上面的例子来推导一下贝叶斯公式：\n",
    "\n",
    "- 事件 A： 下午有课\n",
    "- 事件 B： 食堂吃饭\n",
    "\n",
    "显然两个事件有相关性，即 $A\\cap B$非空。那我们可以有：\n",
    "$$\n",
    "P(A|B) = \\frac{P(A \\cap B)}{P(B)}\n",
    "$$\n",
    "在食堂吃饭的那些日子里面，下午有课的概率是可以通过将所有日子里面，下午有课同时又在食堂吃饭的日子总数 除以 在食堂吃饭的日子总数  。如果没有问题的话，我们还可以有：\n",
    "$$\n",
    "P(B|A) = \\frac{P(A \\cap B)}{P(A)}\n",
    "$$\n",
    "如果这也没有问题的话，我们可以在两个等式中解出\n",
    "$$\n",
    "P(A \\cap B) = P(A|B)\\cdot P(B) = P(B|A) \\cdot P(A)\n",
    "$$\n",
    "接着我们就得证明了\n",
    "\n",
    "$$\n",
    "P(B|A) = \\frac{P(B)\\cdot P(A|B)}{P(A)}\n",
    "$$\n",
    "\n",
    "继续把证明做到底。 在选择哪里吃饭的问题上，我们假设有两个选择：食堂($B_1$) 和宝龙广场($B_2$)\n",
    "\n",
    "那么我们完成全概率的证明：\n",
    "\n",
    "下午有课的时候，不是去食堂就是去宝龙吃饭\n",
    "\n",
    "$$\n",
    "P(A) = P(A \\cap B_1) + P(A \\cap B_2)\n",
    "$$\n",
    "根据上面的证明，我们有\n",
    "$$\n",
    "P(A \\cap B_1) = P(A|B_1)\\cdot P(B_1)\n",
    "$$\n",
    "\n",
    "因此我们有：\n",
    "$$\n",
    "P(A) = P(A|B_1)\\cdot P(B_1) + P(A|B_2)\\cdot P(B_2)\n",
    "$$\n",
    "\n",
    "好了，如果我们中午去吃饭的地方不止两个，有$n$个地方，那我们就把上式给推广成为：\n",
    "$$\n",
    "P(A) = \\sum_{i=1}^{n} P(B_i) \\cdot P(A|B_i)\n",
    "$$\n",
    "带入之前的公式，我们就有了完整的bayes公式啦\n",
    "\n",
    "$$\n",
    "P(B|A) = \\frac{P(B)\\cdot P(A|B)}{P(A)} = \\frac{P(B)\\cdot P(A|B)}{\\sum_{i=1}^{n} P(B_i) \\cdot P(A|B_i)}\n",
    "$$ \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "### 贝叶斯理论的应用\n",
    "\n",
    "\n",
    "#### （一）著名的辛普森案：\n",
    "\n",
    "\n",
    "为了证明辛普森有杀妻之罪，控方说辛普森之前家暴的历史，但问题是，辛普森虐待前妻与他有没有谋杀她有什么关系呢？控方的观点是，长期对前妻实施家庭暴力说明辛普森有谋杀前妻的动机。\n",
    "\n",
    "\n",
    "而辩护律师说美国有400万女性被丈夫或男友打过，而其中只有1432人被杀，概率是2500分之一。 \n",
    "\n",
    "\n",
    "不管是控方还是辩方，都在诱导陪审团考虑这样一个条件概率：**在已知丈夫曾经殴打妻子的前提下，丈夫谋杀妻子的概率是多少。**\n",
    "\n",
    "\n",
    "这其实就是误用了条件概率， 因为辩护律师用的条件是家暴，用来推测的事件是丈夫杀人， 而事实上这里的条件是被杀而且有家暴，而要推测的事件是：**在已知丈夫曾经殴打妻子，并且妻子确实死于谋杀的双重前提下，丈夫谋杀妻子的概率是多少。** 这才是贝叶斯分析的正当用法。\n",
    "\n",
    "\n",
    "想象我们的样本是100,000个被丈夫殴打过的妇女。假设辩护律师的数据属实，那么这其中大概有40个妇女最终会被丈夫谋杀（100,000×1/2500 = 40）。我们再假设，另外还有3个妇女被丈夫以外的人谋杀了（这是根据美国联邦调查局于1992年发布的女性被谋杀的数据算出来的）。也就是说，被谋杀的43位女性中，有40个妇女是被对她们实施家暴行为的丈夫杀掉的。因此，**在已知丈夫曾经殴打妻子，并且妻子确实被人谋杀的双重前提下，丈夫谋杀妻子的概率高达93%！**\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### (二) 金融中的应用：\n",
    "\n",
    "**给定沪深300的指数下降的条件下，贵州茅台的价格是否会下降**\n",
    "\n",
    "\n",
    "贝叶斯定理简单地遵循了条件概率公理。条件概率是一个事件发生另一个事件的概率。例如：“贵州茅台股价下跌的概率是多少？”条件概率进一步问：“鉴于沪深300下跌较早，茅台股价下跌的概率是多少？”\n",
    "\n",
    "\n",
    "在B已经发生的情况下，A的条件概率可以表示为：。\n",
    "\n",
    "\n",
    "如果A是 \"茅台价格下跌\"，设$P(MT)$是茅台下跌的概率；而B是: \"沪深300指数已经下跌\"，设$P(300)$是沪深300指数下跌的概率；那么条件概率表达式为：\"在沪深300指数下跌的情况下，茅台下跌的概率等于茅台价格下跌和沪深300指数下跌的概率除以沪深300指数下跌的概率。\n",
    "$$\n",
    "P(MT|300) = \\frac{P(MT, 300)}{P(300)}\n",
    "$$\n",
    "$P(MT, 300)$是A和B同时发生的概率。这也等于在A发生的情况下，A发生的概率乘以B发生的概率，表示为$P(MT) \\times P(300|MT)$。这两个表达式相等的事实导致了贝叶斯定理，该定理被写成。\n",
    "\n",
    "\n",
    "如果，\n",
    "$$\n",
    "P(MT, 300)=P(MT)\\times P(300|MT)=P(300) \\times P(MT|300)\n",
    "$$\n",
    "\n",
    "\n",
    "那么，\n",
    "$$ P(MT|300) = \\frac{P(MT) \\times P(300|MT)}{P(300)}\n",
    "$$\n",
    "\n",
    "\n",
    "其中$P(MT)$和$P(300)$是茅台和沪深300指数下跌的概率，互不相干。\n",
    "\n",
    "\n",
    "该公式解释了在看到证据之前假设的概率$P(MT)$，和得到证据之后假设的概率$P(MT|300)$之间的关系。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 递推贝叶斯算法，Recursive Bayesian estimation\n",
    "\n",
    "Imagine you have a hypothesis about some phenomenon in the world. What is the probability that the hypothesis is true?\n",
    "\n",
    "The **hypothesis** has some **prior probability** which is based on past knowledge (previous observations). To update it with a **new observation**, you multiply the **prior probability** by the respective **likelihood term** and then divide by the **evidence term**. The updated prior probability is called the **posterior probability**.\n",
    "\n",
    "The posterior probability then becomes the **next prior** which you can update from another observation. And so the **cycle continues**.\n",
    "\n",
    "\n",
    "- Prior probability density $p(\\theta)$\n",
    "\n",
    "Our previous knowledge / hypothesis of $\\theta $ from, for example, a past experience, with the absence of some proof.\n",
    "\n",
    "- Likelihood function $p(z|\\theta)$\n",
    "\n",
    "The likelihood reads as **the probability of the observation, given that the hypothesis is true**. This term basically represents how strongly the hypothesis predicts the observation.\n",
    "\n",
    "So, the higher the likelihood, the higher the posterior probability is going to be.\n",
    "\n",
    "- Normalization factor(Evidence)  $p(z)$\n",
    "\n",
    "The evidence is the probability of a particular statement being true.\n",
    "\n",
    "- Posterior probability density $p(\\theta|z)$\n",
    "\n",
    "The posterior probability is obtained after multiplying the prior probability by the likelihood and then dividing by the evidence.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**简单的例子**\n",
    "\n",
    "Hypothesis $\\theta $: \n",
    "1. 感冒  F\n",
    "2. 肺癌  C\n",
    "\n",
    "Evidence $z$：\n",
    "\n",
    "1. 咳嗽合并气促  $E_1$\n",
    "2. 胸痛并反复感染  $E_2$\n",
    "3. 明显的呼吸功能障碍   $E_3$\n",
    "\n",
    "\n",
    "\n",
    "简单回顾一下：\n",
    "\n",
    "如果老王只有 咳嗽合并气促 症状/Evidence: 有两个假设 感冒或者癌症 \n",
    "\n",
    "我们知道如果 Likelihood function $ P(E_1 | C) = P(E_1| F) = 1 $，你应该做什么判断，由于 Prior probability $P(C) < P(F)$, 因此 Posterior probability $P(C|E_1) < P(F|E_1)$, 我们由此判断该病人是患感冒，而不是癌症。\n",
    "\n",
    "**递推贝叶斯算法**\n",
    "\n",
    "考虑一下，医生在排除是普通感冒，而诊断癌症的过程。。。\n",
    "\n",
    "开始情况下，即没有任何evidence的情况下，老王患癌症的概率是 $P(C)$。\n",
    "\n",
    "医生先查是否有  咳嗽合并气促 \n",
    "$$ P(C_1):= P(C|E_1) = P(C) * \\frac{P(E_1 | C)}{P(E_1)} $$ \n",
    "\n",
    "再检查 是否有 胸痛并反复感染\n",
    "$$ P(C_2):= P(C_1|E_2) = P(C_1) * \\frac{P(E_2 | C_1)}{P(E_2)} $$\n",
    "\n",
    "再检查 是否有 明显的呼吸功能障碍\n",
    "$$ P(C_3):= P(C_2|E_2) = P(C_2) * \\frac{P(E_3 | C_2)}{P(E_3)} $$\n",
    "\n",
    "当 $P(C_3)$ 差不多接近于1的时候，医生就只能宣布确诊了\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### （二） 狼来了的故事 \n",
    "\n",
    "《伊索寓言》中有一则“孩子与狼”的故事，讲的是一个小孩每天到山上放羊，山里有狼出没.第一天，他在山上喊“狼来了！狼来了！”，山下的村民闻声便去打狼，可到了山上，发现狼没有来；第二天也如此；第三天，狼真的来了，可无论小孩怎么喊叫，也没有人来救他，因为前两天他说了慌，人们不再相信他了。\n",
    "\n",
    "**试用贝叶斯公式来分析此寓言中村民对这个小孩的可信度是如何下降的.**\n",
    "\n",
    "首先设： 事件 A 是小孩子说谎，$P(A)$ 是孩子说谎的概率； 事件 B 是 村民相信孩子，$P(B)$ 是村民相信孩子的概率\n",
    "\n",
    "相应的输入还有 $P(A|B)$ 是在村民相信孩子的条件下孩子说谎的概率，$P(A|\\overline{B})$ 是在村民不相信孩子的情况下孩子说谎的概率.  \n",
    "\n",
    "根据常识判断，在开始村民们是相信孩子的，因此可以设一个比较积极的先验概率 $P(B) = y = 0.8$, 然后在村民相信孩子的情况下，小孩说谎的概率较低（不好辜负大家的信任），因此有$P(A|B) = a = 0.1$， 在村民不相信孩子的情况下，小孩的说谎的概率相对较高（反正已经是不信我了），因此有$P(A|\\overline{B}) =  b = 0.5$。\n",
    "\n",
    "这样我们有 $P(A) = P(A|B)\\times P(B) + P(A|\\overline{B}) \\times P(\\overline{B})$\n",
    "\n",
    "\n",
    "其中就有 $ P(\\overline{B}) = 1 - P(B)$\n",
    "\n",
    "\n",
    "孩子的信用度的变化就是 $P(B)$ 在一次一次说谎后变化的规律，我们可以通过编程来看 \n",
    "\n",
    "$$\n",
    "P(B|A) = P(B) \\times \\frac{P(A|B)}{P(A|B)\\times P(B) + P(A|\\overline{B}) \\times P(\\overline{B})}\n",
    "$$\n",
    "\n",
    "下面我们用代码解决一个递归过程："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "def bayes_inf(prior_prob, p_a_b, p_a_xb):\n",
    "    # prior_prob是先验概率； p_a_b 为 P(A|B)， p_a_xb 为 P（A|~B）  均为标量\n",
    "    return prior_prob * p_a_b / (p_a_b * prior_prob + p_a_xb * (1-prior_prob))\n",
    "\n",
    "def aggregate_bayes(prior_prob, p_a_b, p_a_xb, n):\n",
    "    post_prob = [prior_prob] # 定义一个列表，return时该列表长度等于 n + 1\n",
    "    p_prob = prior_prob\n",
    "    for i in range(n):\n",
    "        p_prob = bayes_inf(p_prob, p_a_b, p_a_xb)\n",
    "        post_prob.append(p_prob)\n",
    "    return post_prob\n",
    "           "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0.8, 0.44444444444444453, 0.13793103448275867, 0.03100775193798451]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 孩子不断说谎的情况，村民对他的信任度越来越差\n",
    "# 设开始状态 村民比较信任孩子 P(B) = prior_prob = 0.8\n",
    "# 在村民相信孩子的情况下，小孩说谎的概率较低（不好辜负大家的信任），因此有 P(A|B) = 0.1， \n",
    "# 在村民不相信孩子的情况下，小孩的说谎的概率相对较高（反正已经是不信我了），因此有 P(A|\\overline{B}) = 0.5。\n",
    "# P(A) = 0.1 * 0.8 + 0.5 * 0.2 = 0.18\n",
    "\n",
    "aggregate_bayes(0.8, 0.1, 0.5, 3) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0.03,\n",
       " 0.04433497536945813,\n",
       " 0.06506024096385543,\n",
       " 0.0945157526254376,\n",
       " 0.1353760445682452,\n",
       " 0.19019045134359522,\n",
       " 0.26051220964860056,\n",
       " 0.3457343099541553,\n",
       " 0.4421655621700554,\n",
       " 0.5431641110078835,\n",
       " 0.6407342436024961,\n",
       " 0.727904648286461,\n",
       " 0.8005096315338909]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 孩子修复信用的过程，通过不断说真话让大家重新开始信任他\n",
    "# 设开始状态 村民不信任孩子 prior_prob = 0.03 = P(B)\n",
    "# 在村民相信孩子的情况下，小孩说真话的概率较高（不好辜负大家的信任），因此有 P(A|B) = 0.9， \n",
    "# 在村民不相信孩子的情况下，小孩的说真话的概率相对较低（反正已经是不信我了），因此有 P(A|\\overline{B}) = 0.6。\n",
    "# P(A) = 0.9 * 0.03 + 0.7 * 0.97 = 0.706\n",
    "\n",
    "aggregate_bayes(0.03, 0.9, 0.6, 12) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
